Crosscorrelation-based multispeaker speech activity detection

Authors

Kornel Laskowski

Qin Jin

Tanja Schultz

Abstract

We propose an algorithm for segmenting multispeaker meeting audio, recorded with personal channel microphones, into speech and non-speech intervals for each microphone’s wearer. An algorithm of this type turns out to be necessary prior to subsequent audio processing because, in spite of close-talking microphones, the channels exhibit a high degree of crosstalk due to unbalanced calibration and small inter-speaker distance. The proposed algorithm is based on the short-time crosscorrelation of all channel pairs. It requires no prior training and executes in one fifth real time on modern architectures. Using meeting audio collected at several sites, we present error rates for the segmentation task which do not appear correlated with microphone type or number of speakers. We also present the resulting improvement in speech recognition accuracy when segmentation is provided by this algorithm.
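The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea of short-time crosscorrelation over channel pairs, not the authors' method. It assumes that crosstalk appears on a neighbouring microphone as an attenuated, delayed copy of the wearer's speech, so the lag of the crosscorrelation peak hints at which channel picked up the signal first. The function names `peak_lag` and `crosstalk_flags`, the lag range, and the peak threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def peak_lag(x, y, max_lag):
    """Peak of the normalized crosscorrelation of two equal-length frames.

    Returns (lag, value). A positive lag means y looks like a delayed
    copy of x. Assumes 0 <= max_lag < len(x).
    """
    n = len(x)
    denom = np.sqrt(np.dot(x, x) * np.dot(y, y))
    if denom == 0.0:
        return 0, 0.0  # at least one frame is all zeros
    # full[i] holds the correlation at lag i - (n - 1)
    full = np.correlate(y, x, mode="full")
    window = full[n - 1 - max_lag : n + max_lag]
    i = int(np.argmax(window))
    return i - max_lag, float(window[i] / denom)


def crosstalk_flags(frame, max_lag=20, min_peak=0.5):
    """Flag channels whose frame looks like a delayed copy of another channel.

    frame has shape (channels, samples). Both thresholds are illustrative
    assumptions, not values taken from the paper.
    """
    num_channels = frame.shape[0]
    flags = [False] * num_channels
    for a in range(num_channels):
        for b in range(num_channels):
            if a == b:
                continue
            lag, peak = peak_lag(frame[a], frame[b], max_lag)
            if lag > 0 and peak >= min_peak:
                flags[b] = True  # channel b lags behind channel a
    return flags


# Synthetic two-channel frame: channel 1 is a weaker, delayed copy of channel 0.
rng = np.random.default_rng(0)
source = rng.standard_normal(400)
delay = 5
leaked = 0.5 * np.concatenate([np.zeros(delay), source[:-delay]])
frame = np.stack([source, leaked])
print(peak_lag(frame[0], frame[1], max_lag=20))
print(crosstalk_flags(frame))
```

In this toy frame only the delayed channel is flagged. A usable detector would also need at least an energy check for silent frames and some handling of overlapping speakers, neither of which is attempted here.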

Related papers

Multispeaker Speech Activity Detection for the ICSI Meeting Recorder

As part of a project into speech recognition in meeting environments, we have collected a corpus of multi-channel meeting recordings. We expected the identification of speaker activity to be straightforward given that the participants had individual microphones, but simple approaches yielded unacceptably erroneous labelings, mainly due to crosstalk between nearby speakers and wide variations in...

Improved speech activity detection using cross-channel features for recognition of multiparty meetings

We describe the development of a speech activity detection system using an HMM-based segmenter for automatic speech recognition on individual headset microphones in multispeaker meetings. We look at cross-channel features (energy and correlation based) to incorporate into the segmenter for the purpose of addressing errors related to cross-channel phenomena such as crosstalk. Results demonstrate...

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, much of the non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

Separation of Multispeaker Speech Using Excitation Information

In this paper, we propose an approach for separating speech of individual speakers from a multispeaker speech signal using excitation source information. The proposed approach is demonstrated in a two-microphone case. The main issue in the two-microphone case is the estimation of delay of each speaker. We propose a method for delay estimation in multispeaker case using the knowledge of excitati...

Hidden Markov Model Based Speech Activity Detection for the ICSI Meeting Project

Publication date: 2004
